NSF PAR Search | NSF Public Access Repository


Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games

Yang, Tong; Dai, Bo; Xiao, Lin; Chi, Yuejie (July 2025, PMLR)

Multi-agent reinforcement learning (MARL) lies at the heart of a plethora of applications involving the interaction of a group of agents in a shared unknown environment. A prominent framework for studying MARL is Markov games, with the goal of finding various notions of equilibria, such as the Nash equilibrium (NE) and the coarse correlated equilibrium (CCE), in a sample-efficient manner. However, existing sample-efficient approaches require either tailored uncertainty estimation under function approximation or careful coordination of the players. In this paper, we propose a novel model-based algorithm, called VMG, that incentivizes exploration by biasing the empirical estimate of the model parameters towards those with higher collective best-response values of all the players when fixing the other players’ policies, thus encouraging the policy to deviate from its current equilibrium for more exploration. VMG is oblivious to different forms of function approximation, and permits simultaneous and uncoupled policy updates of all players. Theoretically, we also establish that VMG achieves a near-optimal regret for finding both the NEs of two-player zero-sum Markov games and CCEs of multi-player general-sum Markov games under linear function approximation in an online environment, nearly matching counterparts that rely on sophisticated uncertainty quantification.
Free, publicly-accessible full text available July 13, 2026
In-Context Learning with Representations: Contextual Generalization of Trained Transformers

Yang, Tong; Huang, Yu; Liang, Yingbin; Chi, Yuejie (December 2024, 38th Conference on Neural Information Processing Systems)

Full Text Available
In-context learning with representations: Contextual generalization of trained transformers

Yang, Tong; Huang, Yu; Liang, Yingbin; Chi, Yuejie (December 2024, Advances in Neural Information Processing Systems (NeurIPS))

Full Text Available
Federated Natural Policy Gradient and Actor Critic Methods for Multi-task Reinforcement Learning

Yang, Tong; Cen, Shicong; Wei, Yuting; Chen, Yuxin; Chi, Yuejie (December 2024, 38th Conference on Neural Information Processing Systems)

Full Text Available
The $N_f C_F^3$ contribution to the non-singlet splitting function at four-loop order

https://doi.org/10.1016/j.physletb.2023.138427

Gehrmann, Thomas; von Manteuffel, Andreas; Sotnikov, Vasily; Yang, Tong-Zhi (February 2024, Physics Letters B)

Full Text Available
Complete $N_f^2$ contributions to four-loop pure-singlet splitting functions

https://doi.org/10.1007/JHEP01(2024)029

Gehrmann, Thomas; von Manteuffel, Andreas; Sotnikov, Vasily; Yang, Tong-Zhi (January 2024, Journal of High Energy Physics)

Full Text Available
Renormalization of twist-two operators in covariant gauge to three loops in QCD

https://doi.org/10.1007/JHEP04(2023)041

Gehrmann, Thomas; von Manteuffel, Andreas; Yang, Tong-Zhi (April 2023, Journal of High Energy Physics)

Full Text Available
Well‐Posedness in Gevrey Function Space for 3D Prandtl Equations without Structural Assumption

https://doi.org/10.1002/cpa.21989

Li, Wei‐Xi; Masmoudi, Nader; Yang, Tong (August 2022, Communications on Pure and Applied Mathematics)

Full Text Available
Knowledgebra: An Algebraic Learning Framework for Knowledge Graph

https://doi.org/10.3390/make4020019

Yang, Tong; Wang, Yifei; Sha, Long; Engelbrecht, Jan; Hong, Pengyu (June 2022, Machine Learning and Knowledge Extraction)

Knowledge graph (KG) representation learning aims to encode entities and relations into dense continuous vector spaces such that knowledge contained in a dataset can be consistently represented. Dense embeddings trained from KG datasets benefit a variety of downstream tasks such as KG completion and link prediction. However, existing KG embedding methods fall short of providing a systematic solution for the global consistency of knowledge representation. We developed a mathematical language for KG based on an observation of their inherent algebraic structure, which we term Knowledgebra. By analyzing five distinct algebraic properties, we proved that the semigroup is the most reasonable algebraic structure for the relation embedding of a general knowledge graph. We implemented an instantiation model, SemE, using simple matrix semigroups, which exhibits state-of-the-art performance on standard datasets. Moreover, we proposed a regularization-based method to integrate chain-like logic rules derived from human knowledge into embedding training, which further demonstrates the power of the developed language. As far as we know, by applying abstract algebra in statistical learning, this work develops the first formal language for general knowledge graphs, and also sheds light on the problem of neural-symbolic integration from an algebraic perspective.
Full Text Available
Precise Error Estimation for Sketch-based Flow Measurement

https://doi.org/10.1145/3487552.3487856

Chen, Peiqing; Wu, Yuhan; Yang, Tong; Jiang, Junchen; Liu, Zaoxing (November 2021, Proceedings of the 21st ACM Internet Measurement Conference (IMC '21))

As a class of approximate measurement approaches, sketching algorithms have significantly improved the estimation of network flow information using limited resources. While these algorithms enjoy sound error-bound analysis under worst-case scenarios, their actual errors can vary significantly with the incoming flow distribution, making their traditional error bounds too "loose" to be useful in practice. In this paper, we propose a simple yet rigorous error estimation method to more precisely analyze the errors for posterior sketch queries by leveraging the knowledge from the sketch counters. This approach will enable network operators to understand how accurate the current measurements are and make appropriate decisions accordingly (e.g., identify potential heavy users or answer "what-if" questions to better provision resources). Theoretical analysis and trace-driven experiments show that our estimated bounds on sketch errors are much tighter than previous ones and match the actual error bounds in most cases.
Full Text Available
